
    Reusing Data and Metadata to Create New Metadata Through Machine-Learning & Other Programmatic Methods

    Recent improvements in natural language processing (NLP) enable metadata to be created programmatically from reused original metadata or even the dataset itself. Transfer learning applied to NLP has greatly improved performance and reduced training data requirements. In this talk, we'll compare machine-generated metadata to human-generated metadata and discuss characteristics of metadata and data archives that affect suitability for machine-learning reuse of metadata. Whereas human-generated metadata is often populated once, populated from the perspective of the data supplier, populated by many individuals with different words for the same thing, and limited in length, machine-generated metadata can be updated any number of times, generated from the perspective of any user, constrained to a standardized set of terms that can be evolved over time, and of any length required. Machine-learning-generated metadata offers benefits but also brings additional needs in terms of version control, process transparency, human-computer interaction, and IT requirements. As a successful example, we'll discuss how a dataset of abstracts and associated human-tagged keywords from a standardized list of several thousand keywords was used to create a machine-learning model that predicted keyword metadata for open-source code projects on code.nasa.gov. We'll also discuss a less successful example from data.nasa.gov to show how data archive architecture and characteristics of initial metadata can be strong controls on how easy it is to leverage programmatic methods to reuse metadata to create additional metadata.
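The idea of predicting keywords from a standardized list, using abstracts that humans have already tagged, can be illustrated with a toy sketch. This is not the model described in the talk: it swaps transfer learning for simple word-overlap scoring, and the training texts, keyword names, and function names below are all invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical training data: (abstract text, human-assigned keywords).
# Both the texts and the keyword vocabulary are made up for this sketch.
TRAINING = [
    ("radar imagery processing for sea ice mapping", ["REMOTE SENSING"]),
    ("spacecraft orbit propagation and trajectory design", ["ORBITAL MECHANICS"]),
    ("neural network training for image classification", ["MACHINE LEARNING"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(tagged_abstracts):
    """Build one word-count profile per keyword in the controlled vocabulary."""
    profiles = defaultdict(Counter)
    for abstract, keywords in tagged_abstracts:
        tokens = tokenize(abstract)
        for keyword in keywords:
            profiles[keyword].update(tokens)
    return profiles


def predict(profiles, text, top_n=2):
    """Rank keywords by the share of their profile that overlaps the text.

    Only keywords seen in training can be returned, so the output stays
    inside the standardized list.
    """
    tokens = set(tokenize(text))
    scores = {}
    for keyword, counts in profiles.items():
        total = sum(counts.values())
        scores[keyword] = sum(counts[t] for t in tokens) / total if total else 0.0
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [k for k in ranked[:top_n] if scores[k] > 0]


profiles = train(TRAINING)
print(predict(profiles, "Python library for orbit and trajectory propagation", top_n=1))
```

Because predictions are regenerated from the profiles on every call, retraining on an updated keyword list changes the output for every project at once, which is the "updated any number of times" property the abstract contrasts with human-entered metadata.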

    Facies interpretation and geochronology of diverse Eocene floras and faunas, northwest Chubut Province, Patagonia, Argentina

    The Eocene Huitrera Formation of northwestern Patagonia, Argentina, is renowned for its diverse, informative, and outstandingly preserved fossil biotas. In northwest Chubut Province, at the Laguna del Hunco locality, this unit includes one of the most diverse fossil floras known from the Eocene, as well as significant fossil insects and vertebrates. It also includes rich fossil vertebrate faunas at the Laguna Fría and La Barda localities. Previous studies of these important occurrences have provided relatively little sedimentological detail, and radioisotopic age constraints are relatively sparse and in some cases obsolete. Here, we describe five fossiliferous lithofacies deposited in four terrestrial depositional environments: lacustrine basin floor; subaerial pyroclastic plain; vegetated, waterlogged pyroclastic lake margin; and extracaldera incised valley. We also report several new 40Ar/39Ar age determinations. Among these, the uppermost unit of the caldera-forming Ignimbrita Barda Colorada yielded a 40Ar/39Ar age of 52.54 ± 0.17 Ma, ∼6 m.y. younger than previous estimates, which demonstrates that deposition of overlying fossiliferous lacustrine strata (previously constrained to older than 52.22 ± 0.22 Ma) must have begun almost immediately on the subsiding ignimbrite surface. A minimum age for Laguna del Hunco fossils is established by an overlying ignimbrite with an age of 49.19 ± 0.24 Ma, confirming that deposition took place during the early Eocene climatic optimum. The Laguna Fría mammalian fauna is younger, constrained between a valley-filling ignimbrite and a capping basalt with 40Ar/39Ar ages of 49.26 ± 0.30 Ma and 43.50 ± 1.14 Ma, respectively. The latter age is ∼4 m.y. younger than previously reported. These new ages more precisely define the age range of the Laguna Fría and La Barda faunas, allowing greatly improved understanding of their positions with respect to South American mammal evolution, climate change, and geographic isolation.

    Centro de Investigaciones Geológica